The spatial structure of networks 
We study networks that connect points in geographic space, such as transportation networks
and the Internet. We find that there are strong signatures in these networks of topography and 
use patterns, giving the networks shapes that are quite distinct from one another and from non- 
geographic networks. We offer an explanation of these differences in terms of the costs and benefits of 
transportation and communication, and give a simple model based on the Monte Carlo optimization 
of these costs and benefits that reproduces well the qualitative features of the networks studied. 

There has been considerable interest within the physics
community in the last few years in the analysis and
modeling of networked systems, including the world wide
web, the Internet, and biological, social, and
infrastructure networks [1, 2, 3]. Some of these networks, such as
biochemical networks and citation networks, exist only in 
an abstract "network space" where the precise positions 
of the network nodes have no particular meaning. But 
many others, such as the Internet, live in the real space of 
everyday experience, with nodes (e.g., computers in the 
case of the Internet) having well-defined positions. Most 
previous studies have ignored the geography of networks, 
concentrating instead on other issues. Here we argue 
that geography matters greatly, and to ignore it is to 
miss some of these systems' most interesting features. 

A network in its simplest form is a set of nodes or 
vertices joined together in pairs by lines or edges. We 
consider networks in which the vertices occupy particular 
positions in space. The edges in these networks are often 
real physical constructs, such as roads or railway lines in 
transportation networks [4], optical fiber or other
connections in the Internet [5, 6], cables in a power grid [7],
or oil pipelines [8]. In other cases the edges may be more
ephemeral, such as flights between airports [9], business
relationships between companies [10], or wireless
communications [11].

Interest in the spatial structure of networks dates back 
to the economic geography movement of the 1960s [12, 13],
and particularly the work of Kansky [14]. Early work
was hampered however by limited data and computing 
resources, and geographers' attention moved on after a 
while to other topics. Networks have come back into 
the limelight in recent years, particularly as a result of 
interest among physicists, but spatial aspects have not 
received much attention. The best known theoretical 
models of networks either make no reference to space at
all [?], or they place vertices on simple regular
lattices whose structure is quite different from that of real
systems [?]. The successes of these models, which are
considerable, have been in their ability to predict
topological measures such as graph diameters, degree
distributions, and clustering coefficients. Empirical studies of
networks, even networks in which geography plays a
pivotal role, have, with some exceptions [?], similarly
focused almost exclusively on topology [?, 21, 22].

In this paper we look at three specific networks, par- 
ticularly emphasizing their spatial form. The three net- 
works are the Internet, a road network, and a network 
of passenger flights operated by a major airline. To 
make comparison between the networks easier we limit 
our studies to the United States, and we exclude Alaska 
and Hawaii to avoid problems of disjoint maps. 

The first of our three networks is the Internet. We ex- 
amine the network in which the vertices are autonomous 
systems (ASes) and the edges are data connections be- 
tween them (technically, direct-peering relationships). 
The topology of the connections between ASes can be in- 
ferred from routing tables. In our studies we have made 
use of the collection of routing tables compiled by the 
University of Oregon's Route Views project [?]. To
determine the geographical parameters of the network we
use NetGeo [24], a software tool that can return
approximate latitude and longitude for a specified AS.
Combining these two resources, we created a geographic map
of the Internet, from which we then deleted all nodes
falling outside the lower 48 states. This leaves a network
of 7049 nodes and 13 831 edges for data from March 2003.

Our second network is the US interstate highway net- 
work in which the vertices represent intersections, termi- 
nation points of highways, and country borders, and the 
edges represent highways. Vertex positions and edges 
were extracted from GIS databases. For data from 
the year 2000 the network has 935 vertices and 1337 
edges. Our third network, the airline network, is simi- 
larly straightforward. In this network the vertices rep- 
resent airports and there is an edge between every pair 
of airports connected by a scheduled flight. The particu- 
lar case we study is the published schedule of flights for 
Delta Airlines for February 2003, for which there are 187 
vertices and 825 edges. Geographic locations of airports 
were found from standard directories. 

In our analysis of these networks we focus initially on
three fundamental properties: edge lengths, network
diameter, and vertex degrees. In Fig. 1 we show the
distribution of the lengths in kilometers of edges in each of our
networks. Common to all three networks is a clear bias
towards shorter edges, which is unsurprising since long
edges are presumably more expensive to create and
maintain than short ones. When we look more closely,
however, the networks show some striking differences. The
road network has only very short edges, on the order of
10 km to 100 km, while the Internet and airline network
have much longer ones. The latter two networks also
both have bimodal distributions, with a large fraction
of edges of length 2000 km or less, and then a smaller
but distinct peak of longer edges around 4000 km [28].
(These are continent-spanning edges, like coast-to-coast
flights in the airline network.)

FIG. 1: Histograms of the lengths of edges in the three
networks studied here.
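The text does not say how edge lengths were computed from vertex coordinates. One common choice, assumed here purely for illustration, is the great-circle (haversine) distance on a sphere of radius 6371 km. The function names and the data layout (a dictionary of latitude/longitude pairs plus a list of vertex pairs) are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption, not taken from the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing sqrt(a) marginally above 1
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def edge_lengths_km(positions, edges):
    """Lengths of all edges, given {vertex: (lat, lon)} and a list of (u, v) pairs."""
    return [great_circle_km(*positions[u], *positions[v]) for u, v in edges]
```

A histogram of the values returned by `edge_lengths_km` would then correspond to one panel of Fig. 1.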

Simple Euclidean distance between vertices is not the 
only measure of distance in a network however. Another 
commonly used measure is the so-called graph distance, 
which measures the number of edges traversed along the 
shortest path from one vertex to another — the number 
of "legs" of air travel, for instance, or the number of 
"hops" an Internet data packet would make. The largest 
graph distance between any two points in a network is 
called the graph diameter, and it varies widely between 
our networks. For the highway network for example the 
diameter is 61, but it is just 8 for the Internet, even 
though the latter network has far more vertices. And 
for the airline network the diameter is only 3. In the 
jargon of the networks literature, the Internet and the 
airline network form "small worlds," while the interstate 
network does not. 
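Graph distance and diameter as defined above can be computed by breadth-first search from every vertex. The following is a minimal sketch for a connected, unweighted network stored as an adjacency dictionary; the two toy graphs are hypothetical and merely contrast a chain (road-like) with a star (hub-like).

```python
from collections import deque


def bfs_distances(adj, source):
    """Graph distance (number of edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """Largest graph distance between any two vertices of a connected graph."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


# Toy graphs (hypothetical): a 6-vertex chain and a 6-vertex star.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
star = {0: [1, 2, 3, 4, 5], **{i: [0] for i in range(1, 6)}}
print(diameter(chain), diameter(star))  # 5 2
```

Both graphs have six vertices and five edges, yet the single hub of the star cuts the diameter from 5 to 2.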

Euclidean edge lengths and graph distances are not un- 
related: in a graph like the road network, which is com- 
posed mainly of short edges, one will need to traverse a 
lot of such edges to make a long journey, so we would
expect the diameter to be large. Conversely, the presence of
even just a few long edges makes for much smaller
diameters, as demonstrated recently by Watts and Strogatz.
Thus there seems to be a trade-off between Euclidean
distance and number of legs in a journey, an idea that we
exploit below to help explain the observed structure of
our networks.

Another way in which our networks differ is in the 
degrees of their vertices. (The degree of a vertex is the 
number of edges connected to it.) The highest degree 
of any vertex in the highway network is 4, which means 
that the best connected vertex links directly to only 0.4% 
of other vertices. In the airline network by contrast, the 
maximum degree is 141 or 76% of the network, while 
for the Internet it is 2139 or 30%. High-degree vertices 
that connect to a significant fraction of the rest of the 
network are commonly called "hubs"; the airline network
and Internet thus both contain at least one hub (in fact
each contains several), whereas the road network contains
none [29].
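The percentages quoted above can be reproduced from the vertex counts and maximum degrees given in the text, assuming that each percentage is the maximum degree divided by the number of other vertices, n - 1. The helper name is hypothetical.

```python
# (number of vertices, maximum degree) as quoted in the text
networks = {
    "highway": (935, 4),
    "airline": (187, 141),
    "Internet": (7049, 2139),
}


def hub_fraction(n_vertices, max_degree):
    """Fraction of the other n - 1 vertices adjacent to the best-connected vertex."""
    return max_degree / (n_vertices - 1)


for name, (n, kmax) in networks.items():
    print(f"{name}: {100 * hub_fraction(n, kmax):.1f}%")
# highway: 0.4%
# airline: 75.8%
# Internet: 30.3%
```

These values round to the 0.4%, 76%, and 30% given in the text.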

We would like to understand how the observed struc- 
ture of our networks is related to their geographical na- 
ture, and the origin of the marked differences between 
the networks. We present two approaches that shed light 
on these questions. The first is empirical in nature, the 
second theoretical. 

At the empirical level, many of the features we ob- 
serve in these networks can be explained in terms of spa- 
tial dimension. Each of our networks is of course two- 
dimensional in a geographic sense, since it lives on the 
two-dimensional surface of the Earth. However, one can 
also ask about the effective dimension of the network 
itself [23]. We find that, in a sense we will shortly
define, the Internet and airline networks are not really
two-dimensional at all, but the road network is.

The road network is, in fact, almost planar. That is, 
it can be drawn on a map without any edges crossing. 
This automatically gives it a two-dimensional form and 
helps us to understand why its edges are so short: if edges 
are not allowed to cross then they cannot travel far be- 
fore they run into one another. It also goes some way 
towards explaining the network's low vertex degrees: it 
can be proved that the mean degree k of a planar graph is
strictly less than 6 [?] (a planar graph with n >= 3 vertices
has at most 3n - 6 edges, so k <= 6 - 12/n), and indeed we
find that the mean degree of the road network is k = 2.86.
For the airline network, on the other hand, k = 8.82, so this
network cannot be planar. This is not an entirely persuasive
argument, however. The Internet has mean degree k = 3.93,
which is not large enough to rule out planarity, and the
highway network is actually not perfectly planar: it has
a small number of road crossings, so that rigorous
tests of planarity such as Kuratowski's theorem [?]
or the Hopcroft-Tarjan planarity algorithm fail. We
would like, therefore, some other more flexible way of 
probing the dimension of our networks. We propose the 
following. 

On an infinite regular d-dimensional lattice, such as
a square or cubic lattice, the dimension d can be
calculated from

$$d = \lim_{r \to \infty} \frac{\mathrm{d} \log N_v(r)}{\mathrm{d} \log r},$$

where $N_v(r)$ is the number of vertices r steps or less from a
given vertex v [20, 25]. On finite lattices one cannot take the limit
$r \to \infty$, but good results for d can be achieved by plotting
$\log N_v$ against $\log r$ for some central vertex v and
measuring the slope of the initial part of the resulting line.
This idea can be used also to define an effective
dimension for networks. (In order to reduce statistical errors,
$N_v$ is averaged over all vertices v, but in other respects
the calculation is identical.) We show the resulting plots
for the interstate network and the Internet in Fig. 2,
panels (a) and (b). As the figure shows, the slope of the plot
is close to 2 for the interstates, indicating that this
network is essentially two-dimensional. For the Internet, on
the other hand, the plot grows much faster with r,
indicating that the network has high dimension, or perhaps
no well-defined dimension at all (similar results are seen
for the airline network).

FIG. 2: The size of neighborhoods vs. their radius on
doubly-logarithmic plots (a) for interstate highways, (b) for the
Internet, (c) and (d) for simulations based on the optimization
model described in the text. The straight lines have slope
2 and indicate the expected growth for two-dimensional
networks.
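The effective-dimension measurement described above can be sketched as follows: count the vertices within r steps of each vertex by breadth-first search, average over vertices, and fit a straight line to the log-log data. The fitting window (`r_min`, `r_max`), the ordinary least-squares fit, and the periodic-lattice test case are assumptions made for illustration; the text specifies only that the slope of the initial part of the curve is measured.

```python
import math
from collections import deque


def neighborhood_sizes(adj, source, r_max):
    """N_v(r) for r = 0..r_max: number of vertices within r steps of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == r_max:
            continue  # do not explore beyond radius r_max
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    counts = [0] * (r_max + 1)
    for d in dist.values():
        counts[d] += 1
    sizes, total = [], 0
    for c in counts:  # cumulative sum turns shell counts into ball sizes
        total += c
        sizes.append(total)
    return sizes


def effective_dimension(adj, r_min, r_max):
    """Least-squares slope of log <N_v(r)> against log r for r_min <= r <= r_max."""
    mean = [0.0] * (r_max + 1)
    for v in adj:
        for r, s in enumerate(neighborhood_sizes(adj, v, r_max)):
            mean[r] += s / len(adj)
    xs = [math.log(r) for r in range(r_min, r_max + 1)]
    ys = [math.log(mean[r]) for r in range(r_min, r_max + 1)]
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


# Check on a periodic 32 x 32 square lattice (hypothetical test case).
L = 32
lattice = {
    (x, y): [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]
    for x in range(L) for y in range(L)
}
print(round(effective_dimension(lattice, 5, 15), 2))
```

On this lattice N(r) = 2r^2 + 2r + 1 for the radii used, so the fitted slope comes out somewhat below 2 because of the lower-order terms; a larger lattice and fitting window would bring it closer to 2.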

If a network is fundamentally two-dimensional, then 
we would expect it to have a diameter that, like any 
two-dimensional system, varies as the square root of the 
network size. Essentially all other networks, by contrast, 
have diameters varying much more slowly, usually loga- 
rithmically with network size. Thus, we propose a tenta- 
tive explanation of the structure of our geographic net- 
works as follows. All the networks appear to show a 
preference for short edges over long ones, which is a nat- 
ural effect of geography. However, the road network has 
much shorter edges, lower degrees, and larger diameter 
than the other two. These are all expected consequences
of a two-dimensional or planar form, and when we
measure dimension we do indeed find that the road network
is fundamentally two-dimensional, while the other net- 
works are not. 
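The square-root and logarithmic scalings of the diameter quoted above follow from a short heuristic argument, sketched here under the assumption that the neighborhood size N(r) keeps its functional form up to r of order the diameter D. The growth constant a is introduced only for illustration.

```latex
% d-dimensional network: neighborhoods grow as a power of r
N(r) \sim r^{d}, \qquad N(D) \simeq n
\;\Longrightarrow\; D \sim n^{1/d}, \qquad \text{so } D \sim \sqrt{n} \text{ for } d = 2 .
% exponentially growing neighborhoods (no finite dimension)
N(r) \sim a^{r}, \qquad N(D) \simeq n
\;\Longrightarrow\; D \sim \frac{\log n}{\log a} .
```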

This is a satisfying finding, certainly, but to some ex- 
tent it just passes the intellectual buck: our measure- 
ments can be explained in terms of network dimension- 
ality, but why do the networks have different dimension 
in the first place? As we now show, it is possible to con- 
struct a simple model that explains the basic features of 
geographic networks, including their dimension, in terms 
of competing preferences for either short Euclidean dis- 
tances between vertices or short graph distances. 

First, let us assume that the cost of building and main- 
taining a network is proportional to the total length of 
all its edges: 

$$\mathrm{cost} = \sum_{\mathrm{edges}\ (i,j)} d_{ij}, \tag{1}$$

where $d_{ij}$ is the Euclidean length of the edge between
vertices i and j. This assumption is only approximately true
in most cases, but it is a plausible starting point.

From a user's perspective, a network will usually be 
better if the paths between points are shorter. As we 
have seen, however, the way we measure path length can 
vary. In a road network most travelers look for routes 
that are short in terms of miles, while for airline travelers 
the number of legs is often considered more important. 
To account for these differences, we assign to each edge
an effective length:

$$\text{effective length of edge } (i,j) = \lambda \sqrt{n}\, d_{ij} + (1 - \lambda), \tag{2}$$

where $0 \le \lambda \le 1$ and n is the number of vertices. The
parameter $\lambda$ determines the user's preference for measuring
distance in terms of miles or legs: $\lambda = 1$ counts only miles
and $\lambda = 0$ counts only legs. (The factor of $\sqrt{n}$ is
not strictly necessary but it is convenient; it compensates
for the scaling of nearest-neighbor distances $d_{ij} \sim n^{-1/2}$
with system size.) Now we define the total distance
between two (not necessarily adjacent) vertices to be the
sum of the effective lengths of all the edges along a path
between them, minimized over all paths.
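The total distance just defined is an ordinary weighted shortest-path length, so it can be computed with Dijkstra's algorithm. The sketch below assumes planar coordinates and Euclidean edge lengths; the function names and the five-vertex example are hypothetical. In the example a three-leg chain competes with a two-leg detour through a distant hub.

```python
import heapq
import math


def effective_length(d_ij, n, lam):
    """Eq. (2): lam * sqrt(n) * d_ij + (1 - lam)."""
    return lam * math.sqrt(n) * d_ij + (1.0 - lam)


def total_distances(positions, edges, lam, source):
    """Dijkstra: smallest summed effective length from source to every vertex."""
    n = len(positions)
    adj = {v: [] for v in positions}
    for u, v in edges:
        w = effective_length(math.dist(positions[u], positions[v]), n, lam)
        adj[u].append((v, w))
        adj[v].append((u, w))
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return best


# Chain 0-1-2-3 along a line, plus a hub (vertex 4) linked to both ends.
positions = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (3, 0), 4: (1.5, 2)}
edges = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 3)]
print(total_distances(positions, edges, 0.0, 0)[3])  # legs only: via the hub
print(total_distances(positions, edges, 1.0, 0)[3])  # miles only: along the chain
```

With $\lambda = 0$ the best route from vertex 0 to vertex 3 is the two-leg hub route (total distance 2); with $\lambda = 1$ it is the geometrically shorter chain, of total distance $3\sqrt{5}$.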

We now construct a model network as follows. We 
suppose we are given the positions of n vertices that we 
are to connect, we are given a budget for the cost, Eq. (1), of
building the network, and we are given the preference of the
users, meaning we are given a value of $\lambda$. We then search
for network structures that connect all the vertices, can 
be built within budget, and minimize the mean vertex- 
vertex distance between all vertex pairs, for edge lengths 
defined as above. This is a standard combinatorial opti- 
mization problem, for which we can derive good (though 
usually not perfect) solutions using simulated annealing. 
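The optimization just described can be sketched with a basic Metropolis annealer. The text states only that simulated annealing is used; the move set (toggling one edge at a time), the geometric cooling schedule, the minimum-spanning-tree starting state, and all parameter values below are assumptions chosen for illustration, and the budget must be at least the spanning-tree length for the starting state to be feasible.

```python
import itertools
import math
import random


def mean_distance(n, pos, edges, lam):
    """Mean shortest effective distance over all vertex pairs (inf if disconnected)."""
    d = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j in edges:
        d[i][j] = d[j][i] = lam * math.sqrt(n) * math.dist(pos[i], pos[j]) + (1 - lam)
    for k in range(n):  # Floyd-Warshall all-pairs shortest paths
        dk = d[k]
        for i in range(n):
            dik, di = d[i][k], d[i]
            for j in range(n):
                if dik + dk[j] < di[j]:
                    di[j] = dik + dk[j]
    return sum(d[i][j] for i in range(n) for j in range(i + 1, n)) / (n * (n - 1) / 2)


def cost(pos, edges):
    """Eq. (1): total Euclidean length of all edges."""
    return sum(math.dist(pos[i], pos[j]) for i, j in edges)


def minimum_spanning_tree(n, pos):
    """Prim's algorithm; the cheapest connected network, used as the start state."""
    in_tree, edges = {0}, set()
    while len(in_tree) < n:
        i, j = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: math.dist(pos[e[0]], pos[e[1]]))
        in_tree.add(j)
        edges.add((min(i, j), max(i, j)))
    return edges


def anneal(pos, lam, budget, steps=400, t_start=1.0, t_end=1e-3, seed=0):
    """Metropolis search over edge sets that stay connected and within budget."""
    rng = random.Random(seed)
    n = len(pos)
    pairs = list(itertools.combinations(range(n), 2))
    edges = minimum_spanning_tree(n, pos)
    energy = mean_distance(n, pos, edges, lam)
    best_edges, best_energy = set(edges), energy
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        trial = edges ^ {rng.choice(pairs)}  # toggle one edge on or off
        if cost(pos, trial) > budget:
            continue  # over budget: reject
        e_trial = mean_distance(n, pos, trial, lam)  # inf if trial is disconnected
        if e_trial <= energy or rng.random() < math.exp((energy - e_trial) / temp):
            edges, energy = trial, e_trial
            if energy < best_energy:
                best_edges, best_energy = set(edges), energy
    return best_edges, best_energy


# Hypothetical usage: 12 random vertices, budget 1.5 times the spanning-tree length.
rng = random.Random(1)
pos = [(rng.random(), rng.random()) for _ in range(12)]
budget = 1.5 * cost(pos, minimum_spanning_tree(len(pos), pos))
for lam in (0.0, 1.0):
    edges, energy = anneal(pos, lam, budget)
    print(lam, len(edges), round(energy, 3))
```

A disconnected trial network has infinite mean distance and is therefore never accepted, which enforces the requirement that all vertices be connected. Recomputing all shortest paths at every step costs O(n^3), which is acceptable for a small demonstration but would need a faster update for networks of realistic size.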

Fig. 3 shows four networks generated in this fashion for
n = 50 vertices placed at random within a square. For
$\lambda = 0$ and $\lambda = 1$ we find networks strongly reminiscent of
airlines and roads respectively: tree-like structures with
long edges and hubs in the first case and structures with
neither long edges nor hubs in the second. For
intermediate values of $\lambda$ the model finds a compromise between
hub formation and local links.

FIG. 3: Optimized network structures for (a) $\lambda = 0$, (b) and (c)
intermediate values of $\lambda$, (d) $\lambda = 1$. Networks (a) and (d) resemble
airline and road networks respectively, while (b) and (c) show
structure intermediate between the two extremes.

To make this comparison more concrete, we have also 
generated networks with the same mean degrees as our 
three empirically observed networks. For n = 200 nodes,
we find that the maximum degree of the model networks
falls from 143 (71.5% of the network) to 7 (3.5%)
as we vary $\lambda$ from 0 to 1. At the same time, the diameter
grows from a small-world-like 4 to a sizable 21. In
Figs. 2c and 2d we show the mean size of the
neighborhood $N_v(r)$ of a vertex as a function of distance r, as
we did for our empirically observed networks. As the
figure shows, the results indicate a network with a roughly
two-dimensional form for large $\lambda$ (Fig. 2c) and a strongly
super-quadratic form for small $\lambda$ (Fig. 2d). All of these
results are in excellent agreement with our empirical
observations for the real airline and road networks.

We propose therefore that the qualitative features of 
spatial networks can be well represented by a simple one- 
parameter family of networks balancing miles traveled 
with number of legs between vertex pairs. Typical road 
networks have the structure one would expect if their 
users care primarily about the length of their journey 
in miles, while airline networks correspond to users who 
care primarily about minimizing the number of legs. 

The results presented here are, inevitably, only the
beginnings of a detailed study of spatial networks. Many
other features of these networks deserve scrutiny, such
as, for instance, the effects of population distribution. 
We hope that others will also investigate this interesting 
class of systems and look forward with anticipation to 
their results. 